SUMMARY


Several subjects each provide a mean and a standard deviation for a
single uncertain quantity, expressing thereby their own opinions of the
value of the quantity. In the paper we discuss the problem of
incorporating all these judgments into a single probability distribution
for the quantity. Under simple, normal assumptions the resulting procedure
is least squares. Disadvantages of this are alleviated by using t-,
instead of normal, distributions. Further refinements incorporate extra
information provided by the standard deviations. These lead to
non-normal distributions and improvements to least squares.




TABLE OF CONTENTS

1.0  INTRODUCTION
2.0  LIKELIHOOD PRINCIPLE
3.0  SINGLE SUBJECT, NO SCALE INFORMATION
     3.1  Subject Trials
4.0  SEVERAL SUBJECTS, NO SCALE INFORMATION
     4.1  Correlations Between Subjects
     4.2  Example
5.0  t-DISTRIBUTION
6.0  SINGLE SUBJECT, WITH SCALE INFORMATION
     6.1  Approximations
     6.2  Example
7.0  SEVERAL SUBJECTS, SCALE INFORMATION
8.0  DISCUSSION
REFERENCES
APPENDIX A
     A.1  Single Subject, No Scale Information
          A.1.1  Subject trials
     A.2  Several Subjects, No Scale Information
          A.2.1  Correlations between subjects
     A.3  Uncertainty About γ: t-Distribution
     A.4  Products of t-Likelihoods
     A.5  Single Subject, Scale Information
          A.5.1  Approximations
     A.6  Other Forms of Scale Information
     A.7  Several Subjects, Scale Information




1.0  INTRODUCTION 


We are concerned in this paper with the assessment of an unknown, or
uncertain, quantity θ; for example, the range of a target or the demand for a
product. A subject S expresses his uncertainty about θ in the form of a
probability distribution for θ, or at least provides some features of such a
distribution: for example, its mean and variance. In the case of a target, S
may be a sonarman who has used sonar devices to assess its position. With the
demand for a product, S could be an experienced sales representative. The
subject need not be a human being: thus the sonar device might directly
produce a mean; or a market research organization might carry out a survey of
the product's likely acceptability. We suppose there are several subjects,
S_1, S_2, ..., S_n, each producing his own assessment of θ. In addition, there is
an investigator N who receives these assessments and is required to provide an
overall probability distribution for θ. Thus N may be the ship's commander
who, in addition to advice from the sonarman S_1, receives range estimates
based on deflection/elevation angle, S_2, and an Ekelund range, S_3. In the
product example, N may be the board of the company producing it, receiving
advice on demand from several sources.

The standard approach used in problems of this type is based on least squares,
in which the various estimates (or means) for θ are combined linearly with
weights that depend on the stated variances and possibly on any correlations
perceived between the subjects. In this paper some refinements of least
squares are proposed that take into account more features of subjects'
probability assessments.

The general principles behind this problem have been discussed by Lindley
et al. (1979). The case of the subjectivist assessment of probabilities
for events, rather than quantities, was considered by Lindley (1982) and
French (1980). The least-squares approach has been used by Cohen and
Brown (1980). Morris (1974, 1977) wrote important, pioneering papers on
the subject. DeGroot (1980) has considered the important problem of
how subjects might improve their probability judgments. Bordley and Wolff
(1981) have discussed the problem in the light of the impossibility
results of Dalkey.

Work on this paper was performed under Contract No. N00014-81-C-0330 for
the Office of Naval Research, U.S. Navy, through Decision Science Consortium,
Inc., Falls Church, Virginia. I am grateful to Rex V. Brown for many
stimulating discussions on the topics of the paper and many related,
practical issues.


2.0  LIKELIHOOD  PRINCIPLE 


We first discuss a general principle that is basic to all the analyses,
beginning with the case of a single subject S. Even with a single subject S,
N still has a problem because he may suspect S of biases, or overconfidence.
It is therefore important to distinguish between S's probabilities for θ and
N's: they may well be different. We suppose S states certain features of his
probability distribution for θ. For example, he may state the three
quartiles, saying that for him the probability is one quarter that θ exceeds
the upper quartile, and similarly for the others. In this paper we shall
consider only the case where S states his mean, m, and standard deviation, s,
for θ. (All that is needed is that S provides measures of location and
spread, referred to as m and s.) If in addition he says the distribution is
normal, then his probabilities for θ are completely specified, but we shall
not assume this. Generally then S provides features t_1, t_2, ..., t_m of his
probability distribution for θ.

In addition, N will have a distribution for θ which we write p(θ). The
notation p(·) for probability will always refer to N's assessment, not S's.
Having received S's information in the form of t_1, t_2, ..., t_m, N needs to
calculate p(θ | t_1, t_2, ..., t_m), his revised distribution for θ given that
information. He should do this by Bayes theorem

$$p(\theta \mid t_1, t_2, \ldots, t_m) \propto p(t_1, t_2, \ldots, t_m \mid \theta)\, p(\theta). \tag{2.1}$$

In explanation, the first and last of the probabilities in the formula are the
ones already introduced; namely N's assessment of θ with and without S's
information. The formula says the former is proportional to the product of
the latter and a new probability, p(t_1, t_2, ..., t_m | θ). This is N's probability
that S will announce t_1, t_2, ..., t_m when the value of the quantity is truly θ.
The technical name is likelihood for θ. Thus N is effectively saying: were
the range (or demand) truly to be θ, what would I expect S to say; values with
high probability are likely but those with small probability are unexpected.
Notice that these are probabilities for N about what S says: they are not
probabilities for S, which do not concern N except through the features, t_1,
t_2, ..., t_m, provided.

The general principle is that, in order to incorporate S's information into
his assessment (that is, to change p(θ) into p(θ | t_1, t_2, ..., t_m)) N requires his
likelihood for θ: his assessment of what S will say were the true value to be
θ. Notice the inversion involved here: in order to make probability
statements about θ, N requires probabilities for the t's, given θ. Consider
the case where S just states his mean for θ: we write this as m, replacing
the general notation t_1. Then N has to consider the probability that S will
state m when the quantity is θ: p(m | θ). If this distribution has mean θ,
that is, if E(m | θ) = θ, then N is saying that in his opinion (for these are
all judgments by N, apart from the value of m) S is free from bias.

If E(m | θ) = θ + 2 then N expects S to overestimate by 2. The variance of
p(m | θ) describes N's appreciation of S's precision in assessing θ; a low value
saying he is precise, a high value expressing little trust in S. We therefore
see that the likelihood reflects N's opinion about S as an evaluator of θ.
Readers familiar with Bayesian statistics will notice that N is treating
t_1, t_2, ..., t_m as data to be processed in the same way as any other data.
The fact that the data reflect S's opinions does not affect the method
of analysis, only the structure of the likelihood.
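
The principle can be illustrated numerically. The sketch below applies Bayes theorem (2.1) on a grid of values of θ, for the case where S states only a mean m and N believes E(m | θ) = θ + 2. The normal form of the likelihood, its standard deviation of 1, the flat prior, the stated value m = 12 and the grid limits are all illustrative assumptions, not values taken from the paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_on_grid(grid, prior, likelihood):
    """Apply (2.1): multiply prior by likelihood pointwise, then normalize."""
    weights = [prior(t) * likelihood(t) for t in grid]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical numbers: S announces m = 12; N thinks E(m | theta) = theta + 2
# with standard deviation 1, and holds a flat prior over the grid.
m, bias, sd = 12.0, 2.0, 1.0
grid = [i * 0.01 for i in range(0, 2001)]            # theta from 0 to 20
post = posterior_on_grid(
    grid,
    prior=lambda theta: 1.0,
    likelihood=lambda theta: normal_pdf(m, theta + bias, sd),
)
post_mean = sum(t * p for t, p in zip(grid, post))
```

Under these assumptions N's revised distribution is centered at m minus the bias, which is the sense in which the likelihood carries N's opinion of S as an evaluator.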

The principle extends to several subjects. If S_1 and S_2 both provide
means for θ, m_1 and m_2, then N will require, for insertion into Bayes
formula, p(m_1, m_2 | θ), the probability that, with range (or demand) θ,
S_1 will say m_1 and S_2, m_2. This incorporates not only the concepts of bias
and precision already referred to but also any correlation between S_1 and
S_2. Thus if the subjects share much of their knowledge there may be
positive correlation between them.


3.0  SINGLE  SUBJECT,  NO  SCALE  INFORMATION 


Using the principle we study first the case of a single subject S who
states his mean m, and standard deviation s, for θ. Thus the sonarman
may say that he thinks the target is at 10,000 meters with a standard
deviation of 1,000 meters. This may be expressed indirectly by his
saying that he is 95% sure that the target is between 8,000 and 12,000
meters away. With an implied normal distribution, these values are
m ± 2s. With m and s the features of the subject's distribution for θ,
the investigator has first to assess his probability that S will announce
values m and s, given θ: p(m, s | θ). This distribution can be factored

$$p(m, s \mid \theta) = p(m \mid s, \theta)\, p(s \mid \theta), \tag{3.1}$$

into the probability p(s | θ) that S will state s for the deviation, and
p(m | s, θ), the probability that S will state m, given that he has stated
s. We make

Assumption 1. p(s | θ) does not depend on θ.

In other words, this says that N does not think that the distance the target
is away will affect S's perceived precision of his estimate; or, in the
other example, the company thinks that the representative will think it just
as easy to estimate a high demand as a low one. The assumption may well not
be true: N may think S will perceive distant targets harder to range than
near ones. We explore the relatively simple consequences of the assumption
before proceeding in Section 6.0 to the more complicated case where it is
relaxed.


For this case, Bayes formula (2.1) reads

$$p(\theta \mid m, s) \propto p(m, s \mid \theta)\, p(\theta) = p(m \mid s, \theta)\, p(s \mid \theta)\, p(\theta),$$

by the factorization, and if p(s | θ) does not depend on θ it may be
absorbed into the constant of proportionality to give

$$p(\theta \mid m, s) \propto p(m \mid s, \theta)\, p(\theta). \tag{3.2}$$

Assumption 2. p(m | s, θ) is normal, with mean α + βθ and standard
deviation γs.

To understand this, begin with the case β = 1. Then α is a bias term.
N is expressing the opinion that if the true value is θ, he expects S
to announce θ + α; or, using the normality, θ + α is the most likely
value for m. If θ + α is expected, how far away from it is S likely to be?
This is described by γ. Using a 95% interval, N thinks that S could
state values for m as much as 2γs from the expected value. If γ = 1,
N is effectively agreeing with S's assessment. If γ > 1, N is inflating
S's value, thinking that S is overconfident in his assessment. If γ < 1,
N thinks S is lacking in confidence. For example, if the ship's commander
felt that the sonar device did not take into account all the errors, he
would use γ > 1. A company that felt the representative was unduly
cautious might have γ < 1.

Thus α and γ allow N to express his views about two major features of S's
assessment as he, N, perceives it; namely the bias and the confidence.
The remaining value β allows the bias to change with θ. The bias is
zero at θ = α/(1-β) and, if β > 1, increases with θ. Thus α = 0,
β = 1.1 allows for a bias of 10% of the value of θ; overestimating the
range by 10%. In much of the subsequent work we take β = 1, removing
the bias dependence on θ.

Note that although N judges that s is uninfluenced by θ (assumption 1),
s still plays an important role in N's final judgment of θ. This is
because he uses s (through γs) to say how reliable he thinks m is as
an evaluation of θ. Thus s, on its own, is uninformative, but, in
conjunction with m, provides useful information.

If N has little other knowledge of θ except that provided by S, it is
easy to show (A.1)* that with assumptions 1 and 2, N's distribution for
θ is also normal with mean (m-α)/β and standard deviation γβ^{-1}s,
specializing to (m-α) and γs when β = 1. In that case, all N does is
to correct m for bias and adjust s for confidence. The case α = 0,
γ = 1 leads to N agreeing with S's stated values. N is sometimes said
to think S is calibrated.
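
As a sketch of why this holds, suppose (our assumption, standing in for "little other knowledge") that N's prior p(θ) is effectively constant and that β > 0. Then (3.2) and assumption 2 give, as a function of θ:

```latex
p(\theta \mid m, s) \;\propto\;
\exp\!\left\{-\frac{(m-\alpha-\beta\theta)^2}{2\gamma^2 s^2}\right\}
= \exp\!\left\{-\frac{\bigl(\theta-(m-\alpha)/\beta\bigr)^2}{2\,(\gamma s/\beta)^2}\right\}
```

The right-hand side is a normal kernel in θ with mean (m-α)/β and standard deviation γs/β. The derivation in the Appendix (A.1) may proceed differently.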

3.1  Subject  Trials 

We therefore see that in order to process S's information N needs two
values, α and γ (and also β if not 1). It is important to realize that
α and γ express N's opinion of S as an assessor: for example, the ship's
commander's view of the sonar device. It is, therefore, not sensible to
say that N does not know the values of α and γ: for they are expressions
of what he does know. Nevertheless, it does make sense to say that N
could improve his knowledge of S, and hence his values of α and γ, by
studying S. For example, ship's trials may be performed in which S
states m_i and s_i in trial i for a target whose true range (unknown to
him) is θ_i. With data (m_i, s_i, θ_i) from n trials, reasonable values
for α and γ^2 (see a.3) are

$$\hat{\alpha} = \frac{\sum_i (m_i - \theta_i)\, s_i^{-2}}{\sum_i s_i^{-2}},$$

the weighted average bias, and

$$\hat{\gamma}^2 = \frac{1}{n} \sum_i \frac{(m_i - \theta_i - \hat{\alpha})^2}{s_i^2}.$$

*References beginning with A refer to sections in the Appendix to the paper.
Equations therein begin with a.

Even this treatment may be unsatisfactory if it is felt that trials
carried out under non-combative situations are not totally reliable
as a guide to what might happen in action. Consider, for example, just
the matter of confidence, expressed through γ. If N thinks that the
sonar device is less reliable in action but that S will stick to his
training scheme, N may well increase γ from γ̂. On the contrary, if N
thinks that the device retains its reliability but S loses confidence
in the face of the enemy, γ may need to be decreased. The point is that
α, β and γ are judgmental. The judgmental element can be reduced by
trials, or other data collection, but can rarely be eliminated.
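
The trial-based values of this section can be computed as in the following sketch. It implements the weighted average bias and the mean squared standardized residual (divisor n) as read from the formulas above, with β = 1; the function name and the three trial records are hypothetical and serve only as an illustration.

```python
def trial_estimates(trials):
    """Return (alpha_hat, gamma_hat_sq) from (m_i, s_i, theta_i) triples.

    alpha_hat is the precision-weighted mean of the errors m_i - theta_i;
    gamma_hat_sq is the mean squared standardized residual (divisor n).
    Assumes beta = 1, as in the formulas of this section.
    """
    weights = [1.0 / (s * s) for _, s, _ in trials]
    errors = [m - theta for m, _, theta in trials]
    alpha_hat = sum(w * e for w, e in zip(weights, errors)) / sum(weights)
    n = len(trials)
    gamma_hat_sq = sum(w * (e - alpha_hat) ** 2
                       for w, e in zip(weights, errors)) / n
    return alpha_hat, gamma_hat_sq

# Hypothetical trial data: (stated mean, stated sd, true range), in meters.
trials = [(10500.0, 500.0, 10000.0),
          (8200.0, 500.0, 8000.0),
          (12800.0, 500.0, 12000.0)]
alpha_hat, gamma_hat_sq = trial_estimates(trials)
```

As the text stresses, such values are a starting point for N's judgment rather than a substitute for it.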

An alternative treatment of uncertainty about γ that has certain
advantages will be discussed in Section 5.0.




4.0  SEVERAL  SUBJECTS,  NO  SCALE  INFORMATION 


Consider next the case of several subjects S_1, S_2, ..., S_n each providing
means and standard deviations (m_i, s_i), where N's opinion of each S_i
satisfies assumptions 1 and 2, so that, for each S_i, N has a triplet
(α_i, β_i, γ_i): thus N may have different opinions about the biases and
confidences of the subjects. What is N's overall opinion of θ in the
light of all this information? It is not possible to answer this
question without considering another aspect of the subjects' assessments,
namely their potential correlations. For example, if S_1 overestimates
θ (m_1 > θ), will S_2 tend to do the same? If this is not so; that is, if
the judgments of the subjects are, in N's view, unrelated, we say N
considers them to be independent. In that case the effect of each S_i
is to multiply N's probabilities by the likelihood for S_i, p(m_i | s_i, θ), as in
equation (3.2); and the final result (a.4) is to take the individual
means (m_i - α_i)/β_i and calculate a weighted average of them, the weights
being the inverses of the individual variances γ_i^2 β_i^{-2} s_i^2, termed the
precisions. The final precision is the total of the individual
precisions (a.5).

In the case where N considers each S_i to be calibrated (α_i = 0, β_i = γ_i = 1)
this is the usual least squares procedure mentioned earlier. The general,
uncalibrated case uses a modification of that procedure, adjusting the
raw (m_i, s_i) before combining by least squares.
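
The procedure for independent subjects can be sketched as follows: correct each subject for bias and confidence, then take the precision-weighted average. The sketch assumes N has little other knowledge of θ (a flat prior), and the function name and the two example subjects are hypothetical.

```python
import math

def combine_independent(subjects):
    """Combine independent subjects, each given as (m, s, alpha, beta, gamma).

    Each subject is first corrected to mean (m - alpha) / beta and standard
    deviation gamma * s / beta; the corrected means are then averaged with
    weights equal to their precisions (inverse variances).
    Returns (combined mean, combined standard deviation).
    """
    total_precision = 0.0
    weighted_sum = 0.0
    for m, s, alpha, beta, gamma in subjects:
        mean = (m - alpha) / beta
        sd = gamma * s / beta
        precision = 1.0 / (sd * sd)
        total_precision += precision
        weighted_sum += precision * mean
    return weighted_sum / total_precision, 1.0 / math.sqrt(total_precision)

# Hypothetical inputs: two calibrated subjects (alpha = 0, beta = gamma = 1).
mean, sd = combine_independent([(9000.0, 1000.0, 0.0, 1.0, 1.0),
                                (11000.0, 500.0, 0.0, 1.0, 1.0)])
```

Here the second subject, with half the standard deviation, receives four times the weight of the first, and the combined standard deviation is smaller than either individual one.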

4.1  Correlations Between Subjects


If N does not judge the subjects to be independent, then he has to make a
judgment about the extent of the dependence. If he continues to think
that the various values s_i do not depend on θ (or more precisely,
p(s_1, s_2, ..., s_n | θ) does not involve θ, in generalization of assumption 1)
then he can specify that the m's, given the s's and θ, are multivariate
normal, with means and variances as in assumption 2 but with correlations
ρ_ij between m_i and m_j. A new weighted average results: details are
given in (A.2.1).
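
For two subjects the correlated case can be sketched in closed form. The code below uses the standard generalized least squares result for a bivariate normal likelihood and a flat prior for θ; it is offered as an illustration under those assumptions and is not a transcription of the Appendix formulas, which may be organized differently. The function name and example values are hypothetical.

```python
import math

def combine_two_correlated(subj1, subj2, rho):
    """Combine two subjects whose stated means are correlated, given theta.

    Each subject is (m, s, alpha, beta, gamma). The likelihood is bivariate
    normal with means alpha_i + beta_i * theta, standard deviations
    gamma_i * s_i and correlation rho; the prior for theta is taken as flat.
    Returns (posterior mean, posterior standard deviation) for theta.
    """
    (m1, s1, a1, b1, g1), (m2, s2, a2, b2, g2) = subj1, subj2
    sd1, sd2 = g1 * s1, g2 * s2
    k = 1.0 / (1.0 - rho * rho)
    # Elements of the inverse covariance matrix of (m1, m2).
    q11 = k / (sd1 * sd1)
    q22 = k / (sd2 * sd2)
    q12 = -k * rho / (sd1 * sd2)
    precision = b1 * b1 * q11 + 2.0 * b1 * b2 * q12 + b2 * b2 * q22
    r1, r2 = m1 - a1, m2 - a2            # bias-corrected statements
    numerator = (b1 * q11 + b2 * q12) * r1 + (b1 * q12 + b2 * q22) * r2
    return numerator / precision, 1.0 / math.sqrt(precision)

# Hypothetical calibrated subjects, first independent, then with rho = 0.5.
first = (9000.0, 1000.0, 0.0, 1.0, 1.0)
second = (11000.0, 500.0, 0.0, 1.0, 1.0)
mean0, sd0 = combine_two_correlated(first, second, rho=0.0)
mean5, sd5 = combine_two_correlated(first, second, rho=0.5)
```

With ρ = 0 the independent result is recovered. With ρ = 0.5 in this particular example the less precise subject receives zero weight, which shows how strongly a correlation can alter the weighted average.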


There is a difficulty here that is perhaps worthy of comment: namely,
how is N to make a reasoned judgment about the correlations? We offer
two suggestions. Since correlation is a connection between only two
subjects, it suffices to think of a pair of subjects, S_1 and S_2. The
distribution of m_1, given s_1 and θ, is normal with mean α_1 + β_1 θ and
standard deviation γ_1 s_1. Consider next the distribution of m_2, given m_1
(as well as s_1, s_2 and θ). In other words, suppose N had already
received S_1's judgment, what would he expect S_2's to be? This requires
a mean and a variance. Standard results show that, using a bivariate
normal distribution for m_1 and m_2, the latter is γ_2^2 s_2^2 (1 - ρ_12^2) instead
of the γ_2^2 s_2^2 had S_2 been considered on his own. Thus the variance is
multiplied by 1 - ρ_12^2, a proportionate reduction of ρ_12^2. A correlation
of 1/2 reduces the variance by 25%. Hence the correlation may be
interpreted as a variance reduction.

An alternative interpretation is through the mean of m_2 given m_1. The
normal distribution implies that this is equal to

$$\alpha_2 + \beta_2\theta + (m_1 - \alpha_1 - \beta_1\theta)\,\frac{\rho_{12}\,\gamma_2 s_2}{\gamma_1 s_1}.$$


This has the form of the original mean plus a multiple of the deviation
of m_1 from its expectation; the multiple involving the correlation
required. Consequently, the correlation has an alternative interpretation
in the mean. Both interpretations should be used to give a check that
the correlation is indeed of the magnitude suggested by either approach.
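
Both interpretations can be checked numerically. The sketch below evaluates the conditional mean and variance of m_2 given m_1 for a trial value of the correlation, so that N can see whether the implied shift and variance reduction look reasonable. The function name and the numbers are hypothetical.

```python
def conditional_m2_given_m1(m1, theta, subj1, subj2, rho):
    """Mean and variance of m_2 given m_1 (and s_1, s_2, theta).

    subj1 and subj2 are (s, alpha, beta, gamma) tuples describing N's view
    of each subject under assumption 2; rho is the correlation between
    m_1 and m_2 for fixed theta.
    """
    s1, a1, b1, g1 = subj1
    s2, a2, b2, g2 = subj2
    sd1, sd2 = g1 * s1, g2 * s2
    mean = a2 + b2 * theta + (m1 - a1 - b1 * theta) * rho * sd2 / sd1
    variance = sd2 * sd2 * (1.0 - rho * rho)
    return mean, variance

# Hypothetical case: both subjects calibrated, true range 10,000 m,
# S_1 overestimates by 400 m, and N judges rho = 0.5.
mean, variance = conditional_m2_given_m1(
    m1=10400.0, theta=10000.0,
    subj1=(1000.0, 0.0, 1.0, 1.0),
    subj2=(500.0, 0.0, 1.0, 1.0),
    rho=0.5,
)
```

In this example N would expect S_2 to overestimate by 100 meters, with a variance 25% below the 250,000 that applies when S_2 is considered alone.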


The following comment on independence and correlation may be appropriate.
In a sense, subjects judging the same quantity are bound to be correlated
merely because they are considering the same quantity. Thus if θ is
large, all the m_i will tend to be large. It is not this sort of
correlation that is used in the argument presented here. That correlation,
or independence, is for a given θ and concerns the fluctuation of the
m's from that θ. Thus we have to ask whether, if S_1 overestimates θ,
m_1 > θ, S_2 is likely to do the same. Later (Section 6.0) similar
considerations will apply to the scale information: if S_1 thinks it is
hard to estimate θ (s_1 large), will S_2 think so too?

4.2  Example

This section concludes with a numerical example adapted from Cohen and
Brown (1980). Table 1 refers to three subjects. The first two columns
give the mean range and standard deviation (in meters) for each of the
subjects for a single target. S_1 thought the target to be nearest but
was very uncertain. S_2 was much more confident that it was far away, and
S_3 was intermediate both in range and error. The next three columns
give N's assessments for the three subjects through the values of α, β and
γ. S_1 is thought to be without bias but to overestimate the error.
S_2 is thought to be without bias at 5,000 meters, but the bias increases by
10% of the distance over that amount; whereas, he tends to underestimate
the error. S_3 has a constant, negative bias but produces an estimate of
error that N trusts. Finally, two of the subjects are thought to be rather
closely correlated with ρ^2 = 1/2 or ρ = 0.71; so that, using the first
interpretation above, the variance of one is reduced by one-half on N
learning about the other's evaluation. The remaining subject is a source
independent of the other two.

TABLE 1

            m        s        α       β      γ
S_1      8000     2500        0     1.0    0.8
S_2     14000      625     -500     1.1    1.2
S_3      9650      900     -450     1.0    1.0

Table 2 shows in its first two columns the corrected values for each
subject separately. Notice how the estimates and the errors are made
more compatible than they were originally. The next three columns of
Table 2 show the inverse of the variances and covariances, the elements
σ^{ij} in the notation of the Appendix (A.2.1). Combination by (a.6) and
(a.7) finally yields a mean range of 11,160 meters with a standard
deviation of 540 meters.

TABLE 2

         (m-α)/β
S_1        8000
S_2       13180
S_3       10100
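
The corrected means in Table 2 follow directly from the Table 1 entries, as the sketch below shows. The standard deviations it returns are computed from the Section 3.0 result γs/β and are our calculation, not values read from Table 2. The function and variable names are hypothetical.

```python
def corrected(m, s, alpha, beta, gamma):
    """Bias- and confidence-corrected mean and standard deviation."""
    return (m - alpha) / beta, gamma * s / beta

# Rows of Table 1: (m, s, alpha, beta, gamma) for the three subjects.
table1 = {
    "S_1": (8000.0, 2500.0, 0.0, 1.0, 0.8),
    "S_2": (14000.0, 625.0, -500.0, 1.1, 1.2),
    "S_3": (9650.0, 900.0, -450.0, 1.0, 1.0),
}
table2 = {name: corrected(*row) for name, row in table1.items()}
for name, (mean, sd) in table2.items():
    print(f"{name}: mean {mean:8.0f}  sd {sd:6.0f}")
```

The corrected mean for S_2 is 14,500/1.1, about 13,182, which Table 2 gives as 13180. After correction the three means are closer together and the three standard deviations less disparate than the raw statements.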


5.0  t-DISTRIBUTIONS


It has been emphasized that the values of α, β and γ are known to N,
though they might be improved by data from planned trials. It was
suggested that estimates, like γ̂, obtained from such trials could be used
by N in subsequent evaluations using S's judgments. We return to the
matter using a more refined analysis which has advantages in a quite
different direction. For simplicity, attention will be confined to the
case where S is known to have no bias, α = 0, β = 1, so that interest
concentrates solely on γ.

What N has to provide is p(m | s, θ), or more fully p(m | s, θ, γ), since
this distribution depends on γ. Let D denote the data obtained in trials
of the sort described above, leading to N's opinion about γ, p(γ | D). Then
γ̂ used above is the mode of this density. A complete analysis would not
merely replace γ by γ̂ but would use the basic result that

$$p(m \mid s, \theta, D) = \int p(m \mid s, \theta, \gamma)\, p(\gamma \mid D)\, d\gamma. \tag{4.1}$$

In words, the probability for m, given s, θ and D, is obtained by taking
the same probability, given γ rather than D, and averaging it over the
distribution of γ given D. It is shown in the Appendix (A.3) that the
result of this calculation is that p(m | s, θ, D) is no longer normal but
has a t-distribution.

The original density p(m | s, θ, γ) can be described by saying (m-θ)/γs has
a normal distribution with zero mean and unit standard deviation. The
new density p(m | s, θ, D) is such that (m-θ)/gs has a t-distribution with
(n-1) degrees of freedom. Here

$$g^2 = \frac{1}{n-1} \sum_i \frac{(m_i - \theta_i - \hat{\alpha})^2}{s_i^2}.$$

So what happens is that γ is replaced by g (which differs from the
previous γ̂ only in using a divisor (n-1) instead of n), and the normal
distribution is replaced by a t-distribution. The latter differs from the
former in assigning more probability to numerically large absolute values
(remember both have mean zero). The difference is appreciable for small
n, but rapidly diminishes for large n. Thus with n = 6, the familiar
2 standard deviations of the normal is replaced by 2½. However, the
use of t has an unexpected feature which is illustrated by an example.


Suppose there are two subjects. S_1 has m_1 = -2.24 and γ_1 s_1 = 1, S_2
has m_2 = +2.24 and γ_2 s_2 = 1. If the distributions are normal, least
squares will conclude that θ has mean 0 and standard deviation 0.71.
(A weighted average of -2.24 and +2.24 with equal weights gives 0; assuming
independence, the total precision is 1 + 1 = 2, so the standard deviation is
1/√2 = 0.71.) Now this is somewhat unsatisfactory because, from S_1's evidence,
N thinks that θ lies in the interval m_1 plus or minus two standard
deviations, that is (-4.24, -0.24); whereas on S_2's evidence the interval
is (+0.24, +4.24); and these two intervals do not overlap. Nevertheless,
N averages the two to give (-1.42, +1.42) centered around 0, which both
S_1's and S_2's evidence suggests is unlikely. What has happened is that
least squares has forced a compromise between two conflicting opinions
and, worse still, says that the compromise value is more precise than
either of the original ones. (Note that the compromise interval has
width 2.84 against 4 for the two original intervals.)

If instead of N saying (m-θ)/γs is normal, he says it is t with 5 d.f.,
then the 95% interval is (-4.81, +0.33) for S_1 and (-0.33, +4.81) for S_2,
which do overlap slightly. Detailed calculations (outlined in the Appendix
(A.4)) show the combined interval using both pieces of evidence is
(-2.91, +2.91). This has width 5.82 (compared with the normal value
of 2.84) which exceeds the width, 5.14, of each of the original
intervals. Consequently, the compromise centered at zero, which is compatible
with both S_1 and S_2, admits a degree of uncertainty over twice that of
the normal compromise and greater than either subject separately suggests.
Thus the t-distribution admits that the discrepancy between the subjects
increases the uncertainty, whereas the normal distribution ignores the
discrepancy and decreases the uncertainty.

Suppose the value 2.24 is increased, say to 3.0, so that S_1 has m_1 = -3.0,
S_2 has m_2 = +3.0, all the other values remaining unaltered while the
discrepancy is increased. The normal result still persists and produces
the combined interval of (-1.42, +1.42). The t-distribution now refuses
to admit a compromise, and the distribution for θ given both pieces of
evidence is bimodal with modes at ±2.0, and a local minimum at zero.
We have not calculated the 95% interval but it will clearly be much
wider than the earlier value 5.82.
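
The behavior of the product of two t-likelihoods can be explored numerically. The sketch below evaluates the product on a grid, assuming a flat prior for θ, unit scale and 5 degrees of freedom for each subject; the function names, grid limits and step size are our own choices, and the calculation is a stand-in for, not a copy of, the method outlined in the Appendix (A.4).

```python
import math

def t_kernel(x, dof):
    """Unnormalized Student-t density with `dof` degrees of freedom."""
    return (1.0 + x * x / dof) ** (-(dof + 1) / 2.0)

def combined_posterior(m1, m2, scale=1.0, dof=5, half_range=30.0, step=0.001):
    """Grid posterior for theta from two independent t-likelihoods.

    Assumes a flat prior for theta and that (m_i - theta) / scale is t with
    `dof` degrees of freedom for each subject. Returns (grid, probabilities).
    """
    n = int(round(2 * half_range / step))
    grid = [-half_range + i * step for i in range(n + 1)]
    dens = [t_kernel((m1 - t) / scale, dof) * t_kernel((m2 - t) / scale, dof)
            for t in grid]
    total = sum(dens)
    return grid, [d / total for d in dens]

def central_interval(grid, probs, level=0.95):
    """Equal-tailed interval containing `level` of the grid probability."""
    tail = (1.0 - level) / 2.0
    cum, lower, upper = 0.0, grid[0], grid[-1]
    for t, p in zip(grid, probs):
        if cum < tail <= cum + p:
            lower = t
        if cum < 1.0 - tail <= cum + p:
            upper = t
        cum += p
    return lower, upper

def modes(grid, probs):
    """Grid points that are strict local maxima of the posterior."""
    return [grid[i] for i in range(1, len(grid) - 1)
            if probs[i - 1] < probs[i] > probs[i + 1]]

grid, probs = combined_posterior(-2.24, 2.24)
interval = central_interval(grid, probs)
grid3, probs3 = combined_posterior(-3.0, 3.0)
modes3 = modes(grid3, probs3)
```

With the means at ±3.0 the two modes fall at ±2.0, as stated in the text. The 95% interval for the ±2.24 case is much wider than the normal-theory (-1.42, +1.42), though its end points depend on the flat-prior and grid assumptions made here and need not agree with the quoted (-2.91, +2.91) to the last digit.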

The replacement of the normal by the t-distribution, therefore, has the
great advantage that compromise is not forced nor is spurious accuracy
claimed. It has the disadvantage over least squares, based on the normal
assumption, that no simple formulae are available for the results of
combining t-distributions. Nevertheless, in these days of good
calculators, the computation, once the program is written, takes only a matter
of minutes.

Although the results using the t-distribution have been presented in the
context of trials leading to values of g and the degrees of freedom
ν = n - 1, the distribution does not require this. Consider the
following scenario. N was originally obliged to consider γ and found it
difficult to fix on a value. After some thought he announced a value for γ
but said that it could be out by a factor of about 2. This factor is
related to the value of ν, if the announced value is used as g. We will see
how to establish this relation below, but all we need note for the moment is
that uncertainty about γ can be incorporated using g in place of γ and
t for the normal, ν incorporating the degree of uncertainty. The
t-distribution is therefore of wide applicability, the additional quantity
ν providing substantial flexibility and realism.


6.0  SINGLE SUBJECT, WITH SCALE INFORMATION

The discussion so far has used assumption 1: that N judges that the
standard deviation quoted by S, by itself, provides no information about
θ. We now relax this assumption. A possibility that will be
investigated is that S's precision for θ decreases with θ, or that his standard
deviation increases with θ. In other words, N thinks that S will find
large quantities harder to estimate than small ones. This is reasonably
true in both the examples of target range and product demand. The
simplest possibility is to suppose the standard deviation, s, increases
linearly with θ. This may be expressed by saying E(s | θ) = λθ. For
example, with λ = 0.1, N expects that S, faced with a target 10,000 meters
away, will give a standard deviation of 1,000 meters; but at twice the
distance, the deviation will be doubled. In addition, N will at least
have to think about how much s might depart from its expectation. The
easiest quantity to consider is the coefficient of variation of s. This
is defined to be the ratio of the standard deviation (of s) to the mean
of s. And the simplest assumption is to suppose this not to change with
θ. We denote this constant by (δ-1)^{-1/2}. (The notation is not
deliberately awkward: a simple form here would lead to complications elsewhere.)
For example, suppose δ = 5, giving a coefficient of 0.5; since the mean
in the above example was 1,000 meters, this leads to a standard deviation
(of s) of 500 meters.

A final consideration, beyond a mean of λθ and a coefficient of variation
of (δ-1)^{-1/2}, can enable N to fix his distribution for s, given θ.
It is unreasonable to suppose the distribution of s to be symmetric
about the mean, a large increase being more likely than an equally large
decrease. An increase from 1,000 meters to 2,000 (at two standard
deviations) may be plausible, but a decrease to zero is not. This suggests
using a distribution with a long tail to the right and a short one to
the left, towards the origin. Such a distribution is obtained by
supposing λδθ/s has a gamma distribution with parameter δ: briefly, a
Γ_δ-distribution. The Appendix (A.5) provides details. In applications
it is probably easiest to think in terms of the logarithm of s, ln s.
This is approximately normal with mean ln λ + ln θ and standard deviation
(δ-1)^{-1/2}. Returning to the numerical example with λ = 0.1 and δ = 5,
ln s is about normal with mean ln θ - 2.30 and standard deviation 0.25.
For a target at 10,000 meters, ln s has mean 6.91, so that the limits at
two standard deviations are 6.41 and 7.41 which, translated back into
meters, are 600 and 1,650 around the mean of 1,000. Thus N would be
surprised if, with a target at 10,000 meters, S quoted a value for s more
than 650 above, or 400 below, the expected value of 1,000. This
illustrates the lack of symmetry mentioned above.

The logarithmic transformation can also enable deviations by a factor
to be handled. Thus at the end of Section 5.0 it was suggested that γ
might differ from γ_0 by a factor of 2; that is, limits for γ would be
2γ_0 and γ_0/2. On taking logarithms the limits are ln γ_0 ± ln 2, so
that a standard deviation of (1/2) ln 2 = 0.35 is indicated.
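The two logarithmic calculations above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `log_scale_limits` and `sd_log_from_factor` are hypothetical, and the log-scale standard deviation of 0.25 is simply the figure quoted in the text.

```python
import math

def log_scale_limits(centre, sd_log, k=2.0):
    """Limits at k standard deviations on the log scale, mapped back."""
    mid = math.log(centre)
    return math.exp(mid - k * sd_log), math.exp(mid + k * sd_log)

def sd_log_from_factor(factor, k=2.0):
    """Log-scale standard deviation implied by 'out by a factor' at k sd."""
    return math.log(factor) / k

# ln s for a target at 10,000 meters: centre 1,000, log-scale sd 0.25 as quoted.
low, high = log_scale_limits(1000.0, 0.25)

# gamma possibly out by a factor of 2: limits gamma_0/2 and 2*gamma_0.
sd_gamma = sd_log_from_factor(2.0)
gamma_low, gamma_high = log_scale_limits(1.0, sd_gamma)

print(round(low), round(high), round(sd_gamma, 2))
```

The limits are asymmetric about the centre on the original scale, which is the lack of symmetry the text describes.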




We now have a distribution for s, given θ. Retaining the normal
distribution for m, given s and θ (assumption 2), Bayes theorem (1) can be
used to find N's distribution for θ, given m and s. The details are given
in the Appendix (A.5). Essentially the result is a distribution for θ of
the form (a.9)

\[ K \exp\left\{ -\frac{(\theta-\mu)^2}{2\sigma^2} \right\} \theta^{\delta+1} \tag{6.1} \]

where K is a constant, δ is as before, and μ and σ are quantities whose
exact expressions in terms of m and s are given in the Appendix as

\[ \mu = \frac{\beta(m-\alpha) - \lambda\delta\gamma^2 s}{\beta^2} \quad\text{and}\quad \sigma = \gamma s \beta^{-1}. \]

6.1  Approximations 


The distribution (6.1) is of the form of a normal kernel, depending on
μ and σ, multiplied by a power of θ. It is not an easy distribution
to handle analytically but computation with it is straightforward and
will be illustrated below. However, it is possible to obtain
approximations to the mean and variance of θ. We do not suggest that these
be used in practice but they are presented here in order to help the reader
understand the effect of the information provided by s alone. The results
are obtained under the supposition that σ/μ is small. In any case σ/μ
needs to be less than 1/3, for otherwise the normal kernel in (6.1) extends
into negative values of θ.

For the mean the approximation is (a.12)

\[ E(\theta \mid m, s) = \frac{m-\alpha}{\beta}\left[ 1 + \frac{\gamma^2(\delta+1)s}{(m-\alpha)^2}\left( s - \frac{\delta}{\delta+1}\,\frac{\lambda(m-\alpha)}{\beta} \right) \right] \]

To appreciate this recall that, with assumption 1 in which s alone gave
no information about θ, the mean was (m-α)/β. The effect of the
information provided by s is to multiply this value by something in square
brackets a little different from one. Since γ²(δ+1)s/(m-α)² > 0, this
correction is above or below one according as s is greater than or less
than λ(m-α)/β times δ/(δ+1). Since s was expected to be λθ and θ is
about (m-α)/β, we see that, ignoring the factor δ/(δ+1), the mean is
increased from its value under assumption 1 if s exceeds its expectation
and is otherwise decreased. (The factor δ/(δ+1) arises from the skewness
of the distributions.) Reflection shows that this makes good sense.


For the variance the approximation is (a.11)

\[ \operatorname{var}(\theta \mid m, s) = \frac{\gamma^2 s^2}{\beta^2}\left[ 1 - \frac{(\delta+1)\gamma^2 s^2}{(m-\alpha)^2} \right] \]

which is always reduced from its value γ²s²/β² when s alone contributes
no information, the reduction being due to the extra information
provided by s. Notice that the reduction increases with δ, or increases
as the coefficient of variation of s, (δ-1)^{-1/2}, decreases; again making
good sense.
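To see the size of the two corrections, the approximations for the mean and variance can be evaluated directly. The sketch below assumes the forms E ≈ ((m-α)/β)[1 + γ²(δ+1)s/(m-α)² · (s - δ/(δ+1) · λ(m-α)/β)] and var ≈ (γ²s²/β²)[1 - (δ+1)γ²s²/(m-α)²]; the function names are hypothetical, and the inputs are those of the example in Section 6.2.

```python
import math

def approx_mean(m, s, alpha, beta, gamma, lam, delta):
    """First-order approximation to E(theta | m, s)."""
    base = (m - alpha) / beta                      # mean under assumption 1
    scale = gamma ** 2 * (delta + 1) * s / (m - alpha) ** 2
    expected_s = delta / (delta + 1) * lam * (m - alpha) / beta
    return base * (1.0 + scale * (s - expected_s))

def approx_var(m, s, alpha, beta, gamma, delta):
    """First-order approximation to var(theta | m, s)."""
    base = gamma ** 2 * s ** 2 / beta ** 2         # variance under assumption 1
    return base * (1.0 - (delta + 1) * gamma ** 2 * s ** 2 / (m - alpha) ** 2)

mean_approx = approx_mean(14000.0, 1250.0, 1000.0, 1.0, 1.0, 0.1, 5.0)
sd_approx = math.sqrt(approx_var(14000.0, 1250.0, 1000.0, 1.0, 1.0, 5.0))
print(round(mean_approx, 1), round(sd_approx, 1))
```

Here s = 1,250 exceeds λ(m-α)/β times δ/(δ+1), so the bracket is above one and the approximate mean lies above (m-α)/β = 13,000, while the approximate standard deviation falls below γs/β = 1,250.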


6.2  Example 

We emphasize that these approximations are not very good except in extreme
cases. So let us now turn to the exact calculations in a specific case.
N's judgment of a single subject S was expressed by the following values:

α = 1,000;  β = 1;  γ = 1;  λ = 0.1;  δ = 5.

Thus he thought S has a constant (β = 1) bias of 1,000 meters (α) but that
his standard deviation was satisfactory (γ = 1). He expected a standard
deviation of one tenth of the true range (λ = 0.1) and a coefficient of
variation in it of 1/2. (These are the numerical values of λ and δ
discussed above.)

The subject then quoted a range of m = 14,000 meters and a standard
deviation of s = 1,250 meters. The values μ, σ and δ + 1 in (6.1) are
easily seen to be

μ = 12,375;  σ = 1,250;  δ + 1 = 6.

The density for θ, equation (6.1), is plotted in Figure 1 and labelled
λ = 0.1. The mean, E(θ|m, s), is 13,100 meters and the standard
deviation is 1,216 meters. Without the information provided by s, these
values would have been 13,000 meters and 1,250 meters, respectively.
The mean range has increased by 1% and the standard deviation has
decreased by 3%. There is about 1 chance in 20 that θ exceeds 14,900
meters and the same chance that it is less than 10,900 meters.
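The exact mean and standard deviation of a density of the form (6.1) are easy to obtain numerically. The following is a minimal sketch, assuming a simple midpoint rule over μ ± 10σ (truncated at zero) is accurate enough; the function names and the grid size are illustrative choices, not part of the original computation.

```python
import math

def kernel_params(m, s, alpha, beta, gamma, lam, delta):
    """Return (mu, sigma) of the normal kernel in (6.1)."""
    mu = (beta * (m - alpha) - lam * delta * gamma ** 2 * s) / beta ** 2
    sigma = gamma * s / beta
    return mu, sigma

def posterior_mean_sd(mu, sigma, delta, n=20000):
    """Mean and sd of the density proportional to
    theta**(delta+1) * exp(-(theta-mu)**2 / (2*sigma**2)) on theta > 0."""
    lo = max(mu - 10.0 * sigma, 0.0)
    hi = mu + 10.0 * sigma
    h = (hi - lo) / n
    thetas = [lo + (i + 0.5) * h for i in range(n)]       # midpoints, all > 0
    logf = [(delta + 1) * math.log(t) - (t - mu) ** 2 / (2.0 * sigma ** 2)
            for t in thetas]
    top = max(logf)                                       # avoid overflow
    w = [math.exp(v - top) for v in logf]
    total = sum(w)
    mean = sum(wi * t for wi, t in zip(w, thetas)) / total
    var = sum(wi * (t - mean) ** 2 for wi, t in zip(w, thetas)) / total
    return mean, math.sqrt(var)

mu, sigma = kernel_params(14000.0, 1250.0, 1000.0, 1.0, 1.0, 0.1, 5.0)
mean, sd = posterior_mean_sd(mu, sigma, 5.0)

mu2, sigma2 = kernel_params(14000.0, 1250.0, 1000.0, 1.0, 1.0, 0.2, 5.0)
mean2, sd2 = posterior_mean_sd(mu2, sigma2, 5.0)
print(round(mu), round(mean), round(sd), round(mu2), round(mean2), round(sd2))
```

With λ = 0.1 the routine should give a mean close to 13,100 meters and a standard deviation close to 1,216 meters, in line with the figures quoted above; the second call repeats the calculation with λ = 0.2.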

In this example, N expected s to be about one tenth (λ = 0.1) of θ.
Using (m-α)/β = 13,000 for θ, the expected value of s is 1,300, and the
quoted value of 1,250 is in reasonable agreement with this. To illustrate
the effect of λ suppose it had been 0.2, so that N anticipated 2,600 rather
than the low value he did receive. Notice that the low value is within
reasonable limits for s. With s expected to be 2,600, ln s is expected to
be 7.863 with a standard deviation of 1/2. Two standard deviation limits
are therefore 6.863 and 8.863, which, undoing the logarithm, give limits
of 960 and 7,070 meters. Figure 1 graphs the density for θ, where it is
labelled λ = 0.2, having μ = 11,750, but σ and δ as before. The mean is
12,500 meters and the standard deviation 1,213 meters. The mean has
decreased by 500 meters, or 4%, as a result of the unexpectedly low value
of s but the standard deviation is unaltered. There is about 1 chance in
20 of θ being above 14,200 meters or below 10,300 meters.

Finally suppose we revert to the original value of λ = 0.1 but increase
δ to 10 so that N is more sure about the value of s, the coefficient of
variation decreasing to 1/3. Then

μ = 11,875;  σ = 1,250;  δ + 1 = 11.

The density is also plotted in Figure 1 and labelled δ = 10. The mean
is 13,190 meters with a standard deviation of 1,190 meters. The mean
has increased by 1½% from its value without s and the standard deviation
has decreased by about 5%. The latter reflects the stronger information
provided by s due to the larger value of δ. It is clear, however, that
the effect of δ is less than that of λ. This seems to be generally true.


7.0  SEVERAL  SUBJECTS,  SCALE  INFORMATION


We now pass from the case of a single subject to several subjects. We
saw above how to handle the m's using a multivariate normal distribution
with means α_i + β_i θ for S_i, variances γ_i² s_i², and correlations
ρ_ij. The result was a normal distribution for θ. Denote its mean by m_0
and its standard deviation by s_0. (The formulae are given by (a.6) and
(a.7) and were illustrated by a numerical example in Section 4.2.) We have
also seen how to handle a single s_i by supposing ln s_i is normal with
mean ln λ_i + ln θ and standard deviation (δ_i-1)^{-1/2}. If several
subjects are involved, it can be assumed that the ln s_i have a
multivariate normal distribution, with means and standard deviations as
just stated, and with correlations τ_ij. As before, using (a.6) and (a.7)
again, on the evidence of the s_i's ln θ will have a normal distribution.
Let its mean be written ln λ_0 and its standard deviation δ_0^{-1/2}.
Then reversing the connection between a Γ_δ-distribution and the normal,
this implies that to a good approximation θδ_0/λ_0 has a Γ_{δ_0}
distribution. This may be combined with the normal density for θ obtained
from the means with the result that the final distribution of θ is still
of the form (6.1) above. The detailed formulae are given in the Appendix
(A.7).


8.0  DISCUSSION


The description of a subject's ability as a probability assessor through
the parameters α, β, γ, λ, and δ allows a considerable amount of
flexibility, yet never leads to distributions more complicated than a normal
kernel times a power (equation (6.1)). This distribution persists even
when correlations are admitted between the stated means, and the stated
variances, of different subjects. Although approximations are available,
exact computations can easily be performed on a simple calculator and,
once programmed, are both simple and fast to use.

The parameters describe one's knowledge of the subject. Unfortunately,
in practical situations this knowledge is not extensive and the
parameters depend almost entirely on personal impressions. These have their
place and are often of great importance but, nevertheless, we need much
more experience in carefully controlled trials of subjects' ability as
assessors. A lot of work in this field is concerned with untrained
subjects, naive in their knowledge of probability. The trials needed
are with highly skilled people, like sonarmen, who have had probability
training. Such trials would also help to determine whether the
assumptions of normality, gamma, or t, made in this paper, are realistic
descriptions of subjects' performance.


APPENDIX A


A.1  Single  Subject,  No  Scale  Information


Assumption 1. p(s|θ) does not involve θ.

Assumption 2. p(m|s,θ) is N(α+βθ, γs).

Then

\[ p(\theta \mid m, s) \propto \exp\left\{ -\frac{1}{2}\left[ \frac{m-\alpha-\beta\theta}{\gamma s} \right]^2 \right\} p(\theta) \tag{a.1} \]

and if p(θ) is uniform over the effective range of the likelihood,

\[ p(\theta \mid m, s) \propto \exp\left\{ -\frac{1}{2}\left[ \frac{\theta-(m-\alpha)/\beta}{\gamma\beta^{-1}s} \right]^2 \right\} \tag{a.2} \]

which is N((m-α)/β, γβ^{-1}s).


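A.1.1  Subject trials.  If β = 1, and n trials give (m_i, s_i, θ_i), then the
likelihood for α and γ is

\[ \gamma^{-n} \exp\left\{ -\frac{1}{2} \sum_i \frac{(m_i-\theta_i-\alpha)^2}{\gamma^2 s_i^2} \right\} \]

and the maximum likelihood estimates are easily seen to be

\[ \hat{\alpha} = \sum_i (m_i-\theta_i)\,s_i^{-2} \Big/ \sum_i s_i^{-2} \]

and

\[ \hat{\gamma}^2 = n^{-1} \sum_i (m_i-\theta_i-\hat{\alpha})^2 / s_i^2 . \tag{a.3} \]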
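The estimates in (a.3) take only a few lines to compute. The sketch below assumes β = 1, as in the text; the function name and the three trials are invented purely for illustration and are not data from any real subject.

```python
def fit_bias_and_scale(trials):
    """Maximum likelihood estimates of alpha and gamma**2 with beta = 1.

    trials: list of (m_i, s_i, theta_i) tuples, one per trial.
    """
    weights = [1.0 / s ** 2 for _, s, _ in trials]
    alpha_hat = sum(w * (m - theta) for w, (m, _, theta) in zip(weights, trials))
    alpha_hat /= sum(weights)
    gamma2_hat = sum((m - theta - alpha_hat) ** 2 / s ** 2
                     for m, s, theta in trials) / len(trials)
    return alpha_hat, gamma2_hat

# Hypothetical trials: quoted mean, quoted standard deviation, true value.
trials = [(11.0, 1.0, 10.0), (23.0, 2.0, 20.0), (29.0, 1.0, 30.0)]
alpha_hat, gamma2_hat = fit_bias_and_scale(trials)
print(alpha_hat, gamma2_hat)
```

The bias estimate is a precision-weighted average of the errors m_i - θ_i, and the scale estimate is the mean squared standardized residual.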




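A.2  Several  Subjects,  No  Scale  Information


For several subjects, judged independent, (a.1) gives a likelihood for θ

\[ \propto \exp\left\{ -\frac{1}{2} \sum_i \left[ \frac{m_i-\alpha_i-\beta_i\theta}{\gamma_i s_i} \right]^2 \right\} \]

giving another normal distribution with mean

\[ \sum_i \frac{\beta_i(m_i-\alpha_i)}{\gamma_i^2 s_i^2} \Big/ \sum_i \frac{\beta_i^2}{\gamma_i^2 s_i^2} \tag{a.4} \]

and variance

\[ \left\{ \sum_i \frac{\beta_i^2}{\gamma_i^2 s_i^2} \right\}^{-1} . \tag{a.5} \]

(a.4) is the usual weighted average of the individual values (m_i-α_i)/β_i
with weights inversely proportional to the variances γ_i²s_i²/β_i²; and (a.5)
adds up the individual precisions (inverses of variances).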
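As a small illustration of (a.4) and (a.5), the pooling of independent subjects can be sketched as follows. The function name and the two subjects are hypothetical; each subject is described by the tuple (m, s, α, β, γ).

```python
def pool_independent(subjects):
    """Mean (a.4) and variance (a.5) of the normal likelihood for theta.

    subjects: iterable of (m, s, alpha, beta, gamma) tuples.
    """
    precision = 0.0
    weighted = 0.0
    for m, s, alpha, beta, gamma in subjects:
        w = 1.0 / (gamma * s) ** 2
        precision += beta ** 2 * w          # individual precision of (m-alpha)/beta
        weighted += beta * (m - alpha) * w
    return weighted / precision, 1.0 / precision

# Two hypothetical unbiased subjects quoting 12 +/- 1 and 16 +/- 2.
pooled_mean, pooled_var = pool_independent(
    [(12.0, 1.0, 0.0, 1.0, 1.0), (16.0, 2.0, 0.0, 1.0, 1.0)])
print(pooled_mean, pooled_var)
```

The more precise subject dominates: the pooled mean lies four times closer to 12 than to 16, and the pooled variance is smaller than either individual variance.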


A.2.1  Correlations between subjects.  If correlations between the
subjects are included by replacing the separate normal distributions for each
m_i, given s_i and θ, by a multivariate normal distribution, the means and
variances will remain the same at α_i + β_i θ and γ_i² s_i², but there will be
a covariance ρ_ij γ_i γ_j s_i s_j, yielding a correlation ρ_ij between m_i and
m_j. For ease of notation write σ_ij for the covariance, equal to the variance
when i = j, and let σ^{ij} denote the elements of the matrix inverse to the
variance-covariance matrix with elements σ_ij. The multivariate normal
distribution gives a likelihood for θ

\[ \propto \exp\left\{ -\frac{1}{2} \sum (m_i-\alpha_i-\beta_i\theta)\,\sigma^{ij}\,(m_j-\alpha_j-\beta_j\theta) \right\} \]

\[ \propto \exp\left\{ -\frac{1}{2} \sum \beta_i\sigma^{ij}\beta_j \left[ \theta - \frac{\sum \beta_i\sigma^{ij}(m_j-\alpha_j)}{\sum \beta_i\sigma^{ij}\beta_j} \right]^2 \right\} \]

where all summations are over both i and j. The mean is

\[ \sum \beta_i\sigma^{ij}(m_j-\alpha_j) \Big/ \sum \beta_i\sigma^{ij}\beta_j \tag{a.6} \]

and the variance

\[ \left( \sum \beta_i\sigma^{ij}\beta_j \right)^{-1} . \tag{a.7} \]

The mean is still a weighted average of the individual contributions
(m_j-α_j)/β_j with weights Σ_i β_i σ^{ij} β_j.
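The formulae (a.6) and (a.7) need only one linear solve, since Σ_i β_i σ^{ij} is the j-th component of the solution w of the system with the covariance matrix on the left and the vector of β's on the right. The sketch below uses a small Gauss-Jordan elimination so as to be self-contained; the function names and the numerical inputs are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def pool_correlated(m, s, alpha, beta, gamma, rho):
    """Mean (a.6) and variance (a.7) for correlated subjects."""
    n = len(m)
    cov = [[rho[i][j] * gamma[i] * gamma[j] * s[i] * s[j] for j in range(n)]
           for i in range(n)]
    w = solve(cov, beta)                       # w_j = sum_i beta_i sigma^{ij}
    precision = sum(wj * bj for wj, bj in zip(w, beta))
    mean = sum(wj * (mj - aj) for wj, mj, aj in zip(w, m, alpha)) / precision
    return mean, 1.0 / precision

m, s = [12.0, 16.0], [1.0, 2.0]
alpha, beta, gamma = [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]
indep = pool_correlated(m, s, alpha, beta, gamma, [[1.0, 0.0], [0.0, 1.0]])
corr = pool_correlated(m, s, alpha, beta, gamma, [[1.0, 0.5], [0.5, 1.0]])
print(indep, corr)
```

With zero correlation the result agrees with (a.4) and (a.5). In this particular illustration a correlation of 0.5 removes all weight from the less precise subject, so correlation can change the weights substantially.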


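A.3  Uncertainty  About  γ:  t-Distribution


With α = 0, β = 1 in n trials yielding (m_i, s_i, θ_i) (see A.1.1) the
likelihood for γ is

\[ \gamma^{-n} \exp\left\{ -\frac{1}{2} \sum_i \frac{(m_i-\theta_i)^2}{\gamma^2 s_i^2} \right\} \]

and, assuming a uniform distribution for γ before the trials, this is
proportional to p(γ|D), D denoting the results of the trials. If we write

\[ g^2 = \frac{1}{n-1} \sum_i \left( \frac{m_i-\theta_i}{s_i} \right)^2 \]

we have from (a.1)

\[ p(m \mid s, \theta, D) \propto \int \exp\left\{ -\frac{(m-\theta)^2}{2\gamma^2 s^2} - \frac{(n-1)g^2}{2\gamma^2} \right\} \gamma^{-(n+1)}\, d\gamma \]

and the standard gamma integral shows this to be proportional to
(1 + t²/ν)^{-(ν+1)/2} where

\[ t = \frac{m-\theta}{gs}, \quad\text{and}\quad \nu = n - 1 \]

so that the quantity called t has a t-distribution with ν degrees of freedom.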
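The quantities g and ν are simple functions of the trial results, and the resulting t-likelihood is easily evaluated on the log scale. The sketch below assumes α = 0 and β = 1, as in this section; the function names and the five trials are hypothetical.

```python
import math

def scale_from_trials(trials):
    """Return (g, nu) from trials given as (m_i, s_i, theta_i) tuples."""
    n = len(trials)
    g2 = sum(((m - theta) / s) ** 2 for m, s, theta in trials) / (n - 1)
    return math.sqrt(g2), n - 1

def t_log_likelihood(theta, m, s, g, nu):
    """Log of (1 + t**2/nu)**(-(nu+1)/2) with t = (m - theta)/(g*s)."""
    t = (m - theta) / (g * s)
    return -0.5 * (nu + 1) * math.log1p(t * t / nu)

# Hypothetical trials with standardized errors 1, -1, 1, -1, 2.
trials = [(11.0, 1.0, 10.0), (18.0, 2.0, 20.0), (31.0, 1.0, 30.0),
          (38.0, 2.0, 40.0), (54.0, 2.0, 50.0)]
g, nu = scale_from_trials(trials)
at_centre = t_log_likelihood(100.0, 100.0, 5.0, g, nu)
off_centre = t_log_likelihood(90.0, 100.0, 5.0, g, nu)
print(g, nu, at_centre, off_centre)
```

A small number of trials gives a small ν and hence heavy tails, which is how uncertainty about γ enters the likelihood for θ.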


A.4  Products  of  t-Likelihoods


Suppose that N's judgment is that the subjects are independent,
assumption 1 holds, and that for S_i, (m_i-α_i-θ)/β_i g_i s_i has a
t-distribution with ν_i degrees of freedom, replacing the normal distribution
previously assumed. Then the likelihood for θ from all subjects is the product
of the separate t-likelihoods

\[ \prod_i \left\{ 1 + \frac{(m_i-\alpha_i-\theta)^2}{\nu_i \beta_i^2 g_i^2 s_i^2} \right\}^{-\frac{1}{2}(\nu_i+1)} \]

Such products are not easy to handle analytically though computation is
straightforward. A simple example will illustrate the principal phenomena.

Consider two t-distributions with the same degrees of freedom ν_1 = ν_2 = ν;
α_i = 0, β_i = 1 (no biases); g_1 s_1 = g_2 s_2 = s, say (equal precision);
with one centered at m_1 = +m and the other at m_2 = -m. The likelihood is then

\[ \left\{ \left[ 1 + \frac{(\theta-m)^2}{s^2\nu} \right] \left[ 1 + \frac{(\theta+m)^2}{s^2\nu} \right] \right\} \tag{a.8} \]

raised to the power -(ν+1)/2. If φ = θ/(s ν^{1/2}) and c = m/(s ν^{1/2}), a
little algebra reduces (a.8) to

\[ \left\{ 1 + \left[ \frac{\phi^2 - c^2 + 1}{2c} \right]^2 \right\} \]

apart from a constant multiple. The resulting density in φ is therefore
bimodal if c² > 1, the modes occurring at φ = ±(c²-1)^{1/2}. If c² ≤ 1, there
is a single mode at φ = 0. The example given in the text had g_i s_i = s = 1,
ν = 5, and m = 2.24 = √5, so that c = 1. Later m was increased to 3.0
giving c = 1.34 and two modes. The interval for θ quoted was evaluated
using numerical integration.
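The condition for two modes can be checked numerically. The following sketch evaluates the product of two t-likelihoods centered at +m and -m and compares a crude grid search with the closed form θ = ±s ν^{1/2} (c²-1)^{1/2}; the function names and the grid spacing are illustrative choices.

```python
import math

def two_t_log_likelihood(theta, m, s, nu):
    """Log-likelihood from two t-likelihoods centered at +m and -m."""
    a = 1.0 + (theta - m) ** 2 / (s * s * nu)
    b = 1.0 + (theta + m) ** 2 / (s * s * nu)
    return -0.5 * (nu + 1) * (math.log(a) + math.log(b))

def modes(m, s, nu):
    """Modes in theta: a single mode at 0 if c**2 <= 1, else two modes."""
    root_nu = math.sqrt(nu)
    c = m / (s * root_nu)
    if c * c <= 1.0:
        return [0.0]
    phi = math.sqrt(c * c - 1.0)
    return [-phi * s * root_nu, phi * s * root_nu]

# Second case of the example: s = 1, nu = 5, m = 3.0, so c is about 1.34.
closed_form = modes(3.0, 1.0, 5.0)
grid = [i * 0.001 for i in range(0, 4001)]
numerical = max(grid, key=lambda th: two_t_log_likelihood(th, 3.0, 1.0, 5.0))
print(closed_form, numerical)
```

For m = 3.0 the modes lie at θ = ±2, well inside the two centers at ±3; for m below √5 the two likelihoods merge into a single mode at zero.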


A.5  Single  Subject,  Scale  Information


Assumption 3. p(s|θ) is such that λδθ/s has a Γ_δ distribution.

X has a Γ_δ distribution if its density is, for X > 0,

\[ e^{-X} X^{\delta} / \delta! \]

with δ > -1. It is usual to write X = χ²/2 and δ = ν/2 - 1, and refer to the
χ²-distribution with ν degrees of freedom; but the 1/2's that occur are a
nuisance in the present context. The density for s is then

\[ p(s \mid \theta) = \exp\{ -\lambda\delta\theta/s \}\,(\lambda\delta\theta)^{\delta+1} / (s^{\delta+2}\,\delta!) \]

and simple calculations show that E(s) = λθ and var(s) = λ²θ²/(δ-1). The
coefficient of variation is therefore (δ-1)^{-1/2}. The logarithm of s, ln s,
is approximately normal with mean ln(λθ) - (1/2)(δ-1)^{-1} and standard
deviation (δ-1)^{-1/2}. (In the moment results it is being assumed that
δ > 1.) The term (1/2)(δ-1)^{-1} is typically small in comparison with ln λθ
and will be omitted in the analyses.
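The "simple calculations" for the moments of s can be verified from the moments of 1/X for a gamma variable: if X has a Γ_δ distribution then E(X^{-r}) = Γ(δ+1-r)/Γ(δ+1) for r < δ+1. The sketch below applies this with the values of the running example; the function name is hypothetical.

```python
import math

def s_moment(r, lam, delta, theta):
    """E(s**r) when lam*delta*theta/s has the Gamma_delta distribution."""
    scale = lam * delta * theta
    return scale ** r * math.gamma(delta + 1 - r) / math.gamma(delta + 1)

lam, delta, theta = 0.1, 5.0, 10000.0
mean_s = s_moment(1, lam, delta, theta)
var_s = s_moment(2, lam, delta, theta) - mean_s ** 2
cv_s = math.sqrt(var_s) / mean_s
print(mean_s, var_s, cv_s)
```

For a target at 10,000 meters this gives a mean of 1,000 meters and a standard deviation of 500 meters for s, the figures used in Section 6.0.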


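Retaining assumption 2 and using Bayes theorem

\[ p(\theta \mid m, s) \propto \exp\left\{ -\frac{1}{2}\left( \frac{m-\alpha-\beta\theta}{\gamma s} \right)^2 - \frac{\lambda\delta\theta}{s} \right\} \theta^{\delta+1} \]

\[ \propto \exp\left\{ -\frac{\beta^2}{2\gamma^2 s^2}\left[ \theta - \frac{\beta(m-\alpha) - \lambda\delta\gamma^2 s}{\beta^2} \right]^2 \right\} \theta^{\delta+1} \]

if p(θ) is locally constant. The form of the distribution of θ is that of
a normal kernel multiplied by θ^{δ+1}. Write μ and σ for the mean and
standard deviation respectively of the normal kernel (not of θ): that is, write

\[ \mu = [\beta(m-\alpha) - \lambda\delta\gamma^2 s]\,\beta^{-2} \]

and

\[ \sigma = \gamma s \beta^{-1}, \]

so that

\[ p(\theta \mid m, s) \propto \exp\left\{ -\frac{(\theta-\mu)^2}{2\sigma^2} \right\} \theta^{\delta+1} . \tag{a.9} \]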

A.5.1  Approximations.  To study this distribution it is necessary to
evaluate the missing constant of proportionality. Writing θ^{δ+1} =
(θ-μ+μ)^{δ+1} and expanding by the binomial theorem the integral of (a.9) is
easily obtained as a finite series in terms of the known moments about the
mean for a normal distribution. Moments may similarly be found since E(θ^r)
leads to the same integral with δ replaced by δ + r. Since the rth moment for
a normal distribution is proportional to σ^r, the series will be in terms of
σ/μ = τ. Consequently, if the coefficient of variation, σ/μ, of the normal
kernel is small, reasonable approximations to the moments can be obtained
by including only the first few terms of the series. Straightforward, but
tedious, algebra shows that

\[ E(\theta \mid m, s) = \mu\{ 1 + (\delta+1)\tau^2 + O(\tau^4) \} \]

and

\[ \operatorname{var}(\theta \mid m, s) = \sigma^2\{ 1 - (\delta+1)\tau^2 + O(\tau^4) \} \]

The approximations obtained by omitting the O(τ⁴) terms will only be
reasonable if (δ+1)τ² is small. Inserting the values of μ and σ into these
approximations yields the expressions (a.11) and (a.12) quoted in Section 6.1.
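The binomial expansion described above can be carried out exactly when δ is an integer, using E(Z^{2j}) = (2j-1)!! for a standard normal Z. The sketch below is written under two stated assumptions: δ is an integer, and the integral is taken over the whole line, so the negligible part of the kernel below zero is ignored. The function names are hypothetical, and the inputs are μ = 12,375, σ = 1,250 and δ = 5 from Section 6.2.

```python
import math

def odd_double_factorial(j):
    """(2j-1)!! = 1*3*5*...*(2j-1), with the empty product equal to 1."""
    out = 1
    for k in range(1, 2 * j, 2):
        out *= k
    return out

def kernel_power_moment(k, tau):
    """E[(1 + tau*Z)**k] for standard normal Z and integer k >= 0."""
    return sum(math.comb(k, 2 * j) * odd_double_factorial(j) * tau ** (2 * j)
               for j in range(k // 2 + 1))

def series_mean_sd(mu, sigma, delta):
    """Mean and sd of the density (a.9) from the finite series."""
    tau = sigma / mu
    base = kernel_power_moment(delta + 1, tau)
    m1 = mu * kernel_power_moment(delta + 2, tau) / base
    m2 = mu ** 2 * kernel_power_moment(delta + 3, tau) / base
    return m1, math.sqrt(m2 - m1 ** 2)

mu, sigma, delta = 12375.0, 1250.0, 5
exact_mean, exact_sd = series_mean_sd(mu, sigma, delta)

tau = sigma / mu
two_term_mean = mu * (1.0 + (delta + 1) * tau ** 2)
two_term_sd = sigma * math.sqrt(1.0 - (delta + 1) * tau ** 2)
print(round(exact_mean), round(exact_sd), round(two_term_mean), round(two_term_sd))
```

Comparing the two pairs of figures shows how much is lost by dropping the O(τ⁴) terms when (δ+1)τ² is about 0.06, as it is in this example.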


A.6  Other  Forms  of  Scale  Information

We next make some comments on assumption 3. There are other possibilities:
for example λ²δθ²/s² may be supposed to have a Γ_δ distribution. This is
equivalent to working in terms of the variance s² rather than the standard
deviation as assumption 3 does. The revised assumption implies that
E(s) = λθ, as before, except that this is only approximate. (E(s) =
λθ(1 - 1/(8δ)) is a better approximation.) The coefficient of variation is
approximately (1/2)δ^{-1/2}, as against (δ-1)^{-1/2} with assumption 3.
Calculations with this assumption yield results in close agreement with those
reported above provided the δ of the new method, δ_1 say, is related to that
of the

Both these assumptions imply that s increases linearly with θ. Another
possibility is that the variance s² increases linearly with θ. In analogy
with assumption 3, it can be supposed that λδθ/s² is Γ_δ; or, in line with
the ideas in the last paragraph, λ²δθ²/s⁴ is Γ_δ. Calculations show that
the effect on the original normal distribution, obtained under assumption
1 of s alone giving no information about θ, is slight. This is because
the standard deviation is about linear with θ^{1/2}; in other words, the rate
of change of s with θ is so slow that no appreciable disturbance from
normality is felt.


A.7  Several  Subjects,  Scale  Information

With several subjects suppose, as described in the text, that on the
evidence of the means, θ is normal with mean m_0 and variance s_0², whereas
on the evidence of the standard deviations alone it is such that θδ_0/λ_0
is Γ_{δ_0}. Then the final distribution is the product of these, namely,

\[ \propto \exp\left\{ -\frac{(\theta-m_0)^2}{2 s_0^2} - \frac{\delta_0\theta}{\lambda_0} \right\} \theta^{\delta_0} \]

of the same form as studied earlier.